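Indirect discrimination is an issue of major concern in algorithmic models. This is particularly the case in insurance pricing, where protected policyholder characteristics are not allowed to be used for pricing. Simply disregarding protected policyholder information is not an appropriate solution, because the protected characteristics can still be inferred from the non-protected ones. This leads to so-called proxy or indirect discrimination. Although proxy discrimination is qualitatively different from the group fairness concepts in machine learning, these group fairness concepts have been proposed to "smooth out" the impact of protected characteristics in the calculation of insurance prices. The purpose of this note is to share some thoughts about group fairness concepts in the light of insurance pricing and to discuss their implications. We present a statistical model that is free of proxy discrimination and therefore unproblematic from an insurance pricing point of view. However, we find that the canonical price in this statistical model does not satisfy any of the three most popular group fairness axioms. This seems puzzling, and we welcome feedback on our example and on the usefulness of these group fairness axioms for non-discriminatory insurance pricing.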
translated by 谷歌翻译
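In general, the Gini index does not give a consistent scoring rule. Therefore, maximizing the Gini index may lead to a wrong decision. The main issue is that the Gini index is a rank-based score that is not sensitive to calibration. We show that the Gini index allows for consistent scoring if we restrict it to the class of auto-calibrated regression models.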
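The point that a rank-based score cannot detect miscalibration can be illustrated with a small sketch. The function `gini_score` below is a hypothetical, simplified Gini computation (one of several conventions, not necessarily the one used in the paper), and the simulated Poisson data are an assumption made purely for illustration. Any strictly increasing transformation of the predictions leaves the ranking, and hence the score, unchanged.

```python
import numpy as np

def gini_score(y, pred):
    """Simplified rank-based Gini score (illustrative convention).

    Observations are sorted by increasing prediction; the score is
    1 - 2 * (trapezoidal area under the cumulative share of y).
    """
    y = np.asarray(y, dtype=float)
    order = np.argsort(pred, kind="stable")
    cum_share = np.cumsum(y[order]) / y.sum()
    area = (cum_share.sum() - 0.5) / len(y)
    return 1.0 - 2.0 * area

# Simulated data (assumption): Poisson responses with known means.
rng = np.random.default_rng(0)
mu = rng.gamma(2.0, 1.0, size=1000)
y = rng.poisson(mu)

g_true = gini_score(y, mu)              # calibrated predictions
g_shifted = gini_score(y, 5 * mu + 3)   # badly miscalibrated, same ranking
g_exp = gini_score(y, np.exp(mu))       # nonlinear, still same ranking

print(g_true, g_shifted, g_exp)
```

All three scores coincide because only the ordering of the predictions enters the computation, which is why the score alone cannot separate a calibrated model from a miscalibrated one with the same ranking.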
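In applications of predictive modeling, such as insurance pricing, indirect or proxy discrimination is an issue of major concern. Namely, there is the possibility that protected policyholder characteristics are implicitly inferred from non-protected ones by predictive models and thus have an undesirable (or illegal) impact on prices. A technical solution to this problem relies on building a best-estimate model using all policyholder characteristics (including protected ones) and then averaging out the protected characteristics when calculating individual prices. However, such approaches require full knowledge of policyholders' protected characteristics, which may itself be problematic. Here, we address this issue by using a multi-task neural network architecture for claim predictions, which can be trained using only partial information on protected characteristics and which produces prices that are free from proxy discrimination. We demonstrate the use of the proposed model and find that its predictive accuracy is comparable to that of a conventional feedforward neural network (on full information). However, this multi-task network has clearly superior performance when policyholder information is partially missing.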
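The "averaging out" step mentioned in this abstract can be sketched as follows. This is only a minimal illustration of that step, not the multi-task network itself. The function name `average_out_protected`, the toy best-estimate model, and the choice of population proportions of the protected attribute as averaging weights are all assumptions introduced here.

```python
import math

def average_out_protected(best_estimate, x, levels, weights):
    """Average a best-estimate price over the protected attribute.

    best_estimate(x, d) is a model that uses both the non-protected
    features x and the protected attribute d. The weights are assumed
    to be population proportions of each protected level; they must
    not depend on x.
    """
    if not math.isclose(sum(weights), 1.0):
        raise ValueError("weights must sum to 1")
    return sum(w * best_estimate(x, d) for d, w in zip(levels, weights))

# Hypothetical best-estimate model: log-linear in x and d.
def toy_best_estimate(x, d):
    return math.exp(0.1 * x + 0.5 * d)

levels = [0, 1]        # two levels of the protected attribute
weights = [0.6, 0.4]   # assumed population proportions

price = average_out_protected(toy_best_estimate, 0.0, levels, weights)
print(price)
```

Because the weights do not depend on `x`, the resulting price cannot pick up information about the protected attribute through its correlation with the non-protected features.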
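An impossibility theorem demonstrates that a particular problem or set of problems cannot be solved as described in the claim. Such theorems put limits on what is possible concerning artificial intelligence, especially super-intelligent AI. As such, these results serve as guidelines, reminders, and warnings to AI safety, AI policy, and governance researchers. They might enable solutions to some long-standing questions by formalizing theories in the framework of constraint satisfaction without committing to one option. In this paper, we categorize impossibility theorems applicable to the domain of AI into five categories: deduction, indistinguishability, induction, tradeoffs, and intractability. We find that certain theorems are too specific or have implicit assumptions that limit their application. We also add a new result (theorem) about the unfairness of explainability, the first explainability-related result in the induction category. We conclude that deductive impossibilities deny 100% guarantees for security. Finally, we give some ideas that hold potential in explainability, controllability, value alignment, ethics, and group decision-making. They can be deepened by further investigation.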
In this paper we explore the task of modeling (semi-)structured object sequences; in particular, we focus on the problem of developing a structure-aware input representation for such sequences. We assume that each structured object is represented by a set of key-value pairs that encode the attributes of the structured object. Given a universe of keys, a sequence of structured objects can then be viewed as an evolution of the values for each key over time. We encode and construct a sequential representation using the values for a particular key (Temporal Value Modeling - TVM) and then self-attend over the set of key-conditioned value sequences to create a representation of the structured object sequence (Key Aggregation - KA). We pre-train and fine-tune the two components independently and present an innovative training schedule that interleaves the training of both modules with shared attention heads. We find that this iterative two-part training results in better performance than a unified network with hierarchical encoding, as well as other methods that use a {\em record-view} representation of the sequence \cite{de2021transformers4rec} or a simple {\em flattened} representation of the sequence. We conduct experiments using real-world data to demonstrate the advantage of interleaving TVM-KA on multiple tasks, along with detailed ablation studies motivating our modeling choices. We find that our approach performs better than flattening sequence objects and also allows us to operate on significantly larger sequences than existing methods.
Optical coherence tomography (OCT) captures cross-sectional data and is used for the screening, monitoring, and treatment planning of retinal diseases. Technological developments to increase the speed of acquisition often result in systems with a narrower spectral bandwidth, and hence a lower axial resolution. Traditionally, image-processing-based techniques have been used to reconstruct subsampled OCT data, and more recently, deep-learning-based methods have been explored. In this study, we simulate reduced axial scan (A-scan) resolution by Gaussian windowing in the spectral domain and investigate the use of a learning-based approach for image feature reconstruction. In anticipation of the reduced resolution that accompanies wide-field OCT systems, we build upon super-resolution techniques to explore methods that better aid clinicians in their decision-making and improve patient outcomes, by reconstructing lost features using a pixel-to-pixel approach with an altered super-resolution generative adversarial network (SRGAN) architecture.
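The idea of simulating reduced axial resolution by Gaussian windowing in the spectral domain can be sketched with a toy example. Everything below is an assumption made for illustration: the helper names, the window parameterization (`sigma_fraction`), and the single-reflector synthetic interferogram. Real OCT processing involves further steps (for example wavenumber linearization and dispersion compensation) that are omitted here.

```python
import numpy as np

def gaussian_window(n, sigma_fraction):
    """Gaussian window over n spectral samples, centered on the spectrum.

    sigma_fraction is the standard deviation as a fraction of n
    (a hypothetical parameterization chosen for this sketch).
    """
    k = np.arange(n) - (n - 1) / 2
    return np.exp(-0.5 * (k / (sigma_fraction * n)) ** 2)

def a_scan(spectrum, sigma_fraction):
    """Window the spectral interferogram, then transform to depth."""
    w = gaussian_window(spectrum.shape[-1], sigma_fraction)
    return np.abs(np.fft.ifft(spectrum * w, axis=-1))

def peak_width(profile):
    """Number of depth samples at or above half the peak maximum,
    measured on the first half of the profile (positive depths)."""
    half = profile[: profile.shape[-1] // 2]
    return int(np.count_nonzero(half >= half.max() / 2))

# Toy interferogram: a single reflector at depth index 100.
n = 1024
k = np.arange(n)
spectrum = np.cos(2 * np.pi * 100 * k / n)

wide = a_scan(spectrum, sigma_fraction=0.25)    # broad bandwidth
narrow = a_scan(spectrum, sigma_fraction=0.03)  # reduced bandwidth

print(peak_width(wide), peak_width(narrow))
```

A narrower spectral window corresponds to a broader point spread function in depth, so the reflector's peak spans more samples in the `narrow` profile than in the `wide` one.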
Real-life tools for decision-making in many critical domains are based on ranking results. With the increasing awareness of algorithmic fairness, recent works have presented measures for fairness in ranking. Many of those definitions consider the representation of different ``protected groups'' in the top-$k$ ranked items, for any reasonable $k$. Given the protected groups, confirming algorithmic fairness is a simple task. However, the groups' definitions may be unknown in advance. In this paper, we study the problem of detecting groups with biased representation in the top-$k$ ranked items, eliminating the need to pre-define protected groups. The number of possible groups can be exponential, making the problem hard. We propose efficient search algorithms for two different fairness measures: global representation bounds and proportional representation. We then propose a method to explain the bias in the representations of groups using the notion of Shapley values. We conclude with an experimental study showing the scalability of our approach and demonstrating the usefulness of the proposed algorithms.
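To make the detection problem concrete, here is a brute-force sketch for a global lower bound on representation in the top-$k$. It is not one of the efficient search algorithms proposed in the paper. It enumerates every group defined by a combination of attribute values, which is exactly the exponential blow-up the abstract refers to. The function name, the `min_size` filter, and the toy data are assumptions made for illustration.

```python
from itertools import combinations

def underrepresented_groups(ranked_items, k, lower_bound, min_size=1):
    """Return groups with fewer than lower_bound members in the top k.

    ranked_items is a list of dicts (best-ranked first) with string
    values. A group is a tuple of (attribute, value) pairs. Groups with
    fewer than min_size members overall are ignored.
    """
    attrs = sorted(ranked_items[0])
    patterns = set()
    for item in ranked_items:
        for r in range(1, len(attrs) + 1):
            for subset in combinations(attrs, r):
                patterns.add(tuple((a, item[a]) for a in subset))

    def matches(item, pattern):
        return all(item[a] == v for a, v in pattern)

    flagged = []
    for pattern in sorted(patterns):
        size = sum(matches(it, pattern) for it in ranked_items)
        in_top = sum(matches(it, pattern) for it in ranked_items[:k])
        if size >= min_size and in_top < lower_bound:
            flagged.append((pattern, in_top))
    return flagged

# Toy ranking (assumption): six items with two attributes.
ranking = [
    {"gender": "F", "region": "north"},
    {"gender": "M", "region": "north"},
    {"gender": "M", "region": "south"},
    {"gender": "M", "region": "north"},
    {"gender": "F", "region": "south"},
    {"gender": "F", "region": "south"},
]

flagged = underrepresented_groups(ranking, k=4, lower_bound=1, min_size=2)
print(flagged)
```

In this toy ranking, neither `gender` nor `region` alone reveals a missing group, but their intersection does, which illustrates why the groups cannot always be fixed in advance.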
Previous fine-grained datasets mainly focus on classification and are often captured in a controlled setup, with the camera focusing on the objects. We introduce the first Fine-Grained Vehicle Detection (FGVD) dataset in the wild, captured from a moving camera mounted on a car. It contains 5502 scene images with 210 unique fine-grained labels of multiple vehicle types organized in a three-level hierarchy. While previous classification datasets also include makes for different kinds of cars, the FGVD dataset introduces new class labels for categorizing two-wheelers, autorickshaws, and trucks. The FGVD dataset is challenging because it contains vehicles in complex traffic scenarios with intra-class and inter-class variations in type, scale, pose, occlusion, and lighting conditions. Current object detectors such as YOLOv5 and Faster R-CNN perform poorly on our dataset due to a lack of hierarchical modeling. Along with providing baseline results for existing object detectors on the FGVD dataset, we also present the results of combining an existing detector with the recent Hierarchical Residual Network (HRN) classifier for the FGVD task. Finally, we show that FGVD vehicle images are the most challenging to classify among the fine-grained datasets.
Three main points: 1. Data Science (DS) will be increasingly important to heliophysics; 2. Methods of heliophysics science discovery will continually evolve, requiring the use of learning technologies [e.g., machine learning (ML)] that are applied rigorously and that are capable of supporting discovery; and 3. To grow with the pace of data, technology, and workforce changes, heliophysics requires a new approach to the representation of knowledge.
In the Earth's magnetosphere, there are fewer than a dozen dedicated probes beyond low-Earth orbit making in-situ observations at any given time. As a result, we poorly understand its global structure and evolution, the mechanisms of its main activity processes, magnetic storms, and substorms. New Artificial Intelligence (AI) methods, including machine learning, data mining, and data assimilation, as well as new AI-enabled missions will need to be developed to meet this Sparse Data challenge.